Data Science Demystified

image info

Source: [1]

When many people hear the term machine learning[2], they often leap to visions of dystopian futures and science fiction tropes. Our current media landscape is littered[3] with[4] examples[5] of[6] sensational[7] headlines[8] about the imminent dangers of artificial intelligence (AI), of which machine learning can generally be described as an applied subset. Zachary Lipton, an assistant professor in the Machine Learning Department at Carnegie Mellon University, has dubbed this effect the "AI misinformation epidemic"[9]. But you already encounter machine learning systems in benign circumstances every day! Whenever you send an email, pay with a credit card, or take a picture with your phone, you're using a machine learning system! Hopefully, by the end of this post, you'll realize that behind the melodramatic discourse, machine learning is becoming a fun and exciting part of modern life that is increasingly important to understand.

Biological Learning

alt text

Source: [10]

Take a look at the portrait above. What attributes do you notice? You might notice things like the person's eye color, hairstyle, or expression, and make some inferences based on your prior experiences. You might guess that this person identifies as male, that they are middle-aged, and that they have a neutral expression. This is an impressive skill. How have you accomplished it?

Let's take a look at our age estimate. Which elements of the portrait led to the prediction that this person is middle-aged? We might notice the gray hair, the wrinkles, and the beard stubble as indicators of this person's age, and so we have classified[2] this person into an age bracket. What we have done is generate a prediction of age based on certain key features[2]! Naturally, the question arises of how these features led to our prediction; why are we so sure this isn't a baby? Well, we arrived at this result after spending our entire lives training[2], or practicing, by exposing ourselves to other human faces, and then testing[2] our knowledge by learning their (hopefully) true ages. Extracting information from faces is an important skill that humans have evolved over millions of years to be particularly good at[11], and we use this information to associate types of facial features with ages. In this example, we have learned through our training over time that gray hair is associated with older people, by validating[2] that the people with gray hair we have seen are indeed older.

Our prediction was middle-aged, but what if we wanted a specific age value? Well, we might take a guess, say 40. How confident would you be in this prediction? If you're anything like me, you wouldn't be very certain at all, but you would probably expect to be off by only a few years. This uncertainty[2] is present in the world all around us, but our brain has developed a set of heuristics, or a mental model[2], to quickly provide estimates. We can think of our brain as a biological learning model!

Are you ready for a surprise? Scroll back and look at that image again, and you may notice some irregularities. This image was generated by a machine learning model! A machine learning system was trained on a large quantity of faces, learned through validation and testing which features represent a face, and produced a model that generates its own prediction of a face! Exciting, right!?

Machine Learning

image info

Source: [12]

Many people in the modern world are familiar with spreadsheets, tables, and databases from their everyday work life. These store data in rows and columns, a format called tabular data. One simple example is the stock chart. A stock chart shows the price of a stock on the y-axis and time on the x-axis. We could represent this data in a tabular format with time and price as columns and each recorded observation as a row.
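To make the idea of tabular data concrete, here is a minimal sketch in Python. The prices are made up for illustration, and the names `stock_table` and `column` are hypothetical helpers rather than part of any library.

```python
# Hypothetical closing prices for one stock; the numbers are made up.
# Each dictionary is one row, and the keys "day" and "price" are the columns.
stock_table = [
    {"day": 1, "price": 100.0},
    {"day": 2, "price": 102.0},
    {"day": 3, "price": 101.0},
    {"day": 4, "price": 105.0},
    {"day": 5, "price": 107.0},
]


def column(table, name):
    """Return every value in one column, from the first row to the last."""
    return [row[name] for row in table]


days = column(stock_table, "day")      # what a chart would put on the x-axis
prices = column(stock_table, "price")  # what a chart would put on the y-axis
print(days)
print(prices)
```

Reading the table column by column gives exactly the two lists you would need to draw the stock chart.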

image info

Source: [13]

What does this have to do with machine learning? Well, what if you wanted to predict the next stock price? You would be predicting the price with a model trained on previous stock information, given the value of a feature such as time! Machine learning uses statistical models in an attempt to replicate the process you use to learn things about the everyday world.
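As a rough illustration of predicting the next price from past data, the sketch below fits a straight line to a handful of made-up prices using ordinary least squares and extends it by one day. The data and the `fit_line` helper are assumptions for this example; a straight line is just one of the simplest statistical models, not a recommendation for real stocks.

```python
# Hypothetical training data: day number (feature) and closing price (target).
days = [1, 2, 3, 4, 5]
prices = [100.0, 102.0, 101.0, 105.0, 107.0]


def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# "Training" is finding the line that best fits the past prices.
slope, intercept = fit_line(days, prices)

# "Predicting" is reading the line at a day we have not seen yet.
next_day = 6
predicted_price = intercept + slope * next_day
print(round(predicted_price, 1))  # about 108.1 for this made-up data
```

Real prices are far noisier than this toy data, so any such prediction carries the same kind of uncertainty as our age guess earlier.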

I'm sure you can imagine plenty of other situations in your work or home life where there are tables of data and important values to estimate or classify. What about whether or not to bid on a contract at work? Given a wealth of information about the client, you would identify the specific features of that contract, such as price, location, and date, and use past contract data to train a model, holding some of that data back as a test. The model would then predict a classification of the project result. This is the same process we saw your brain use to recognize faces, but here we've used a computer to translate it into 1s and 0s, spitting out a yes or no.
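The contract example can be sketched as a tiny classifier. The version below uses a nearest-neighbour vote, which is only one of many possible techniques; the past contracts, the two features (price and distance to the client), and the `predict_bid` function are all hypothetical and chosen purely for illustration.

```python
import math

# Hypothetical past contracts: (price in $1000s, distance to client in km, won?)
# The last value is the label: 1 means the project went well, 0 means it did not.
past_contracts = [
    (50, 10, 1),
    (60, 15, 1),
    (55, 5, 1),
    (200, 300, 0),
    (180, 250, 0),
    (220, 400, 0),
]


def predict_bid(price, distance, k=3):
    """Return 1 (bid) or 0 (do not bid) by majority vote of the k most similar past contracts."""
    ranked = sorted(
        past_contracts,
        key=lambda c: math.dist((price, distance), (c[0], c[1])),
    )
    votes = [won for _, _, won in ranked[:k]]
    return 1 if sum(votes) > k / 2 else 0


print(predict_bid(65, 20))    # resembles the contracts that went well
print(predict_bid(190, 280))  # resembles the contracts that did not
```

A more careful version would put the features on a common scale and check its predictions against contracts it has never seen, which is the testing step described above.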

Machine learning isn't as scary as the media makes it out to be. It's simply computers using applied statistical analysis to predict trends based on past data. There are a myriad of projects that put the modern world's data to use and, hopefully, if carried out ethically, make the world a better place. I hope you've taken these examples to heart and are excited to learn about machine learning projects in your everyday life!

References

terminator

Reference Number: 1 Title: Terminator will return in 2019 with the help of James Cameron Author: Andrew Liptak Source: https://www.theverge.com/2017/9/27/16374734/terminator-sequel-release-date-2019-james-cameron-tim-miller-movie

wikipedia

Reference Number: 2 Title: Wikipedia Source: https://en.wikipedia.org

vox

Reference Number: 3 Title: Stephen Hawking’s final warning for humanity: AI is coming for us Author: Abigail Higgins Source: https://www.vox.com/future-perfect/2018/10/16/17978596/stephen-hawking-ai-climate-change-robots-future-universe-earth

financial-times

Reference Number: 4 Title: Google’s AI fest offers an ominous glimpse of the robot future Author: Richard Waters Source: https://www.ft.com/content/99635a6e-540b-11e8-b3ee-41e0209208ec

bbc

Reference Number: 5 Title: Are you scared yet? Meet Norman, the psychopathic AI Author: Jane Wakefield Source: https://www.bbc.com/news/technology-44040008

nypost

Reference Number: 6 Title: Creepy Facebook bots talked to each other in a secret language Author: Chris Perez Source: https://www.bbc.com/news/technology-44040008

cnbc

Reference Number: 7 Title: Elon Musk: ‘Mark my words — A.I. is far more dangerous than nukes’ Author: Catherine Clifford Source: https://www.cnbc.com/2018/03/13/elon-musk-at-sxsw-a-i-is-more-dangerous-than-nuclear-weapons.html

tech-review

Reference Number: 8 Title: Should a self-driving car kill the baby or the grandma? Depends on where you’re from. Author: Karen Hao Source: https://www.technologyreview.com/2018/10/24/139313/a-global-ethics-study-aims-to-help-ai-solve-the-self-driving-trolley-problem/

guardian

Reference Number: 9 Title: 'The discourse is unhinged': how the media gets AI alarmingly wrong Author: Oscar Schwartz Source: https://www.theguardian.com/technology/2018/jul/25/ai-artificial-intelligence-social-media-bots-wrong

fake-face

Reference Number: 10 Title: UBC MDS DSCI 572: Supervised Learning II, Lecture 8 Author: Tomas Beuzen

scientific-american

Reference Number: 11 Title: Superior Face Recognition: A Very Special Super Power Author: Anna K. Bobak Source: https://www.scientificamerican.com/article/superior-face-recognition-a-very-special-super-power/

stonks

Reference Number: 12 Title: What We Talk About When We Talk About Stonks Author: Jordan Weissman Source: https://slate.com/business/2021/01/stonks-not-stocks-got-it.html

stock-predict

Reference Number: 13 Title: UBC MDS DSCI 572: Supervised Learning II, Lecture 8 Author: Ryan Gillard Source: https://strategicfocus.com/2019/08/30/how-to-quickly-solve-machine-learning-forecasting-problems-using-pandas-and-bigquery/